6 research outputs found

    Towards Data Reliable, Low-Power, and Repairable Resistive Random Access Memories

    A series of breakthroughs in memristive devices have demonstrated the potential of memristor arrays to serve as next-generation resistive random access memories (ReRAM), which are fast, low-power, ultra-dense, and non-volatile. However, memristors' unique device characteristics also make them prone to several sources of error. Owing to the stochastic filamentary nature of memristive devices, various recoverable errors can affect the data reliability of a ReRAM. Permanent device failures further limit the lifetime of a ReRAM. This dissertation develops low-power solutions for more reliable and longer-enduring ReRAM systems. We first look into a data reliability issue known as write disturbance. Writing into a memristor in a crossbar can disturb the stored values in other memristors that are on the same memory line as the target cell. Such disturbance accumulates over time and may lead to complete data corruption. To address this problem, we propose the use of two regular memristors on each word to keep track of the disturbance accumulation and trigger a refresh to restore the weakened data once it becomes necessary. We also investigate the considerable variation in the write-time characteristics of individual memristors. With such variation, conventional fixed-pulse write schemes not only waste significant energy, but also cannot guarantee reliable completion of the write operations. We address such variation by proposing an adaptive write scheme that adjusts the width of the write pulses for each memristor. Our scheme embeds an online monitor to detect the completion of a write operation and takes into account the parasitic effect of line-shared devices in access-transistor-free memristive arrays. We further investigate the use of this method to shorten the test time of memory march algorithms by eliminating the need for a verifying read right after a write, which is commonly employed in the test sequences of march algorithms. Finally, we propose a novel mechanism to extend the lifetime of a ReRAM by protecting it against hard errors through the exploitation of a unique feature of bipolar memristive devices. Our solution proposes an unorthodox use of complementary resistive switches (a particular implementation of memristive devices) to provide an "in-place spare" for each memory cell at negligible extra cost. The in-place spares are then utilized by a repair scheme to repair, at page-level granularity, memristive devices that have failed in a stuck-at-ON state. Furthermore, we explore the use of in-place spares in lieu of other memory reliability and yield enhancement solutions, such as error correction codes (ECC) and spare rows. We demonstrate that with the in-place spares, we can achieve the same lifetime as a baseline ReRAM with either significantly fewer spare rows or a lighter-weight ECC, both of which can save on energy consumption and area.
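
The contrast between fixed-pulse and adaptive writes in the abstract above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the dissertation's circuit: each cell is reduced to a single hypothetical switching time (in nanoseconds), the online monitor is assumed to be checked once per `step_ns`, and total pulse-on time stands in for write energy. The function names and parameters are invented for illustration.

```python
def fixed_pulse_write(switch_times_ns, pulse_ns):
    """Apply one fixed-width pulse to every cell.

    Returns (total pulse-on time, indices of cells that did not finish switching).
    """
    total_on_time = pulse_ns * len(switch_times_ns)
    failed = [i for i, t in enumerate(switch_times_ns) if t > pulse_ns]
    return total_on_time, failed


def adaptive_write(switch_times_ns, step_ns, max_ns):
    """Extend each cell's pulse in small steps until a (hypothetical) monitor
    reports completion, or until the max_ns safety limit is reached."""
    total_on_time = 0
    failed = []
    for i, t in enumerate(switch_times_ns):
        applied = 0
        # The monitor is modeled as the comparison "applied >= t".
        while applied < t and applied < max_ns:
            applied += step_ns
        total_on_time += applied
        if applied < t:
            failed.append(i)
    return total_on_time, failed


# Three cells with widely varying (made-up) switching times.
cells = [10, 20, 80]
print(fixed_pulse_write(cells, pulse_ns=50))
print(adaptive_write(cells, step_ns=5, max_ns=100))
```

In this toy setting the fixed 50 ns pulse over-drives the two fast cells and still leaves the slow cell unwritten, while the stepped pulse stops each cell as soon as the modeled monitor fires.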

    Toward large-scale access-transistor-free memristive crossbars

    Memristive crossbars have been shown to be excellent candidates for building an ultra-dense memory system because a per-cell access-transistor may no longer be necessary. However, the elimination of the access-transistor introduces several parasitic effects due to the existence of partially-selected devices during memory accesses, which could limit the scalability of access-transistor-free (ATF) memristive crossbars. In this paper we discuss these challenges in detail and describe some solutions addressing these challenges at multiple levels of design abstraction.

    End-to-end error correction and online diagnosis for on-chip networks

    In an on-chip network, roughly 80% of the communication faults are transient [9]. Different fault tolerance approaches such as Forward Error Control (FEC), Automatic Repeat Query (ARQ), and multi-path routing have been used and compared in the literature for reliable on-chip transmission [15-17]. These approaches tolerate transient faults, but they become ineffective in the presence of permanent faults. Permanent faults on wires occur both during manufacturing and in the field, causing yield degradation and service costs respectively. The overall system cost can be reduced by adding some spare wires to each link of the network to replace the defective wires [15,18]. Nevertheless, an in-field diagnosis mechanism is required to locate the defective wire and initiate the wire replacement. We propose a comprehensive solution for end-to-end (e2e) error correction and online defect diagnosis for on-chip networks. For e2e error correction, we propose an interleaved error-locality-aware code that efficiently corrects both random and burst errors. We demonstrate that for 64-bit wide network links, interleaving four instances of the proposed code, 2G4L(26,16), each of which supports 16-bit data, can correct as many as two random errors or 16 adjacent errors. In order to maintain the error correction capability of the Error Correcting Code (ECC) for transient and intermittent errors, we further propose an e2e data gathering and online diagnosis approach that locates the defective wires and replaces them with the spare wires embedded in the network. Our analytical and experimental studies show that under heavy noise, high escape rate, uncertainty about routing, and many other harmful effects, the diagnostic data collected by the proposed approach are accurate enough for the purpose of passive diagnosis.
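
The interleaving claim in the abstract above can be checked with a short sketch. It does not implement the 2G4L(26,16) code itself; it only shows how bit-interleaving four 26-bit codewords spreads a burst of 16 adjacent wire faults so that each codeword sees 4 adjacent bit errors. The round-robin wire assignment and the per-codeword burst length of 4 are assumptions inferred from the figures in the abstract, and the function names are hypothetical.

```python
def interleave(codewords):
    """Bit-interleave equal-length codewords.

    Wire i carries bit (i // n) of codeword (i % n), where n = len(codewords).
    """
    n = len(codewords)
    length = len(codewords[0])
    return [codewords[i % n][i // n] for i in range(n * length)]


def deinterleave(wires, n):
    """Recover the n codewords from the interleaved wire vector."""
    return [wires[k::n] for k in range(n)]


def burst_footprint(start, burst_len, n):
    """Map a burst of adjacent faulty wires to the bit indices hit per codeword."""
    hits = {k: [] for k in range(n)}
    for wire in range(start, start + burst_len):
        hits[wire % n].append(wire // n)
    return hits


# Four 26-bit codewords with arbitrary contents (placeholders, not real code words).
codewords = [[(k + j) % 2 for j in range(26)] for k in range(4)]
wires = interleave(codewords)
assert deinterleave(wires, 4) == codewords

# A 16-wire burst starting at wire 10 touches 4 consecutive bits of each codeword.
print(burst_footprint(10, 16, 4))
```

Because any 16 consecutive wire indices contain exactly four indices from each residue class modulo 4, the per-codeword damage never exceeds a 4-bit adjacent burst under this layout.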

    Energy-Efficient GPGPU Architectures via Collaborative Compilation and Memristive Memory-Based Computing

    Thousands of deep and wide pipelines working concurrently make GPGPUs power-hungry components. Energy-efficiency techniques employ voltage overscaling, which increases timing sensitivity to variations and hence aggravates the energy use issues. This paper proposes a method to increase spatiotemporal reuse of computational effort by a combination of compilation and micro-architectural design. An associative memristive memory (AMM) module is integrated with the floating point units (FPUs). Together, we enable fine-grained partitioning of values and find high-frequency sets of values for the FPUs by searching the space of possible inputs, with the help of application-specific profile feedback. For every kernel execution, the compiler pre-stores these high-frequency sets of values in AMM modules – representing partial functionality of the associated FPU – that are concurrently evaluated over two clock cycles. Our simulation results show high hit rates with 32-entry AMM modules that enable a 36% reduction in average energy use by the kernel codes. Compared to voltage overscaling, this technique enhances robustness against timing errors with 39% average energy saving.

    Associative Memristive Memory for Approximate Computing in GPUs

    Using associative memories to enable computing-with-memory is a promising approach to improve energy efficiency. Associative memories can be tightly coupled with processing elements to store and later recall function responses for a subset of input values. This approach avoids the actual function execution on the processing element to save on energy. The challenge, however, is to reduce the energy consumption of the associative memory modules themselves. Here we address the challenge of designing ultra-low-power associative memories. We use memristive parts for memory implementation and demonstrate the energy saving potential of integrating associative memristive memory (AMM) into graphics processing units (GPUs). To reduce the energy consumption of AMM modules, we leverage approximate computing, which benefits from application-level tolerance to errors: we employ voltage overscaling on AMM modules, which deliberately relaxes their search criteria to approximately match stored patterns within a 2-bit Hamming distance of the search pattern. This introduces some errors to the computation that are tolerable for target applications. We further reduce the energy consumption by employing purely resistive crossbar architectures for AMM modules. To evaluate the proposed architecture, we integrate AMM modules with floating point units in an AMD Southern Islands GPU and run four image processing kernels on an AMM-integrated GPU. Our experimental results show that employing AMM modules reduces energy consumption of running these kernels by 23%-45%, compared to a baseline GPU without AMM. The image processing kernels tolerate errors resulting from approximate search operations, maintaining an acceptable image quality, i.e., a PSNR above 30 dB.
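
The approximate search described in the abstract above can be mimicked in software as a lookup that accepts any stored pattern within a Hamming distance of 2. This is a behavioral sketch only: the class and method names are hypothetical, patterns are plain integers rather than floating point operands, and returning the closest match is a software simplification rather than a model of how an overscaled crossbar resolves multiple matching rows.

```python
def hamming(a, b):
    """Number of bit positions in which two non-negative integers differ."""
    return bin(a ^ b).count("1")


class ApproxAssociativeMemory:
    """Toy associative memory: recall a stored result for any search pattern
    within max_distance bits of a stored pattern."""

    def __init__(self, max_distance=2):
        self.max_distance = max_distance
        self.entries = []  # list of (pattern, result) pairs

    def store(self, pattern, result):
        self.entries.append((pattern, result))

    def lookup(self, pattern):
        """Return the result of the closest stored pattern within range, else None."""
        best = None
        for stored, result in self.entries:
            d = hamming(stored, pattern)
            if d <= self.max_distance and (best is None or d < best[0]):
                best = (d, result)
        return None if best is None else best[1]


amm = ApproxAssociativeMemory(max_distance=2)
amm.store(0b10101010, "precomputed")
print(amm.lookup(0b10101001))  # differs in 2 bits: treated as a hit
print(amm.lookup(0b10100101))  # differs in 4 bits: miss, fall back to the FPU
```

On a miss the caller would execute the function on the processing element as usual; on an approximate hit it reuses a result computed for a slightly different input, which is the source of the tolerable error the abstract mentions.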